Evaluating Concrete Strength Model Performance

Using Cross-Validation Methods

Sai Devarasheyyt, Mattick, Musson, Perez

2024-07-27

Introduction to Cross-Validation

  • Measure performance and generalizability of machine learning and predictive models.
  • Compare different models constructed from the same data set.

Cross-validation (CV) is widely used in various fields, including:

  • Machine Learning
  • Data Mining
  • Bioinformatics

It is commonly used to:

  • Minimize overfitting
  • Ensure a model generalizes to unseen data
  • Tune hyperparameters

Definitions

Generalizability:
How well predictive models created from a sample fit other samples from the same population.

Overfitting:
When a model fits the training data too closely, capturing characteristics specific to the training set rather than the underlying patterns:

  • Noise
  • Random fluctuations
  • Outliers

Hyperparameters:
Model configuration variables set before training, for example:

  • Nodes and layers in a neural network
  • Branches in a decision tree

Process

Subset the data into K approximately equal-sized folds:

  • Randomly
  • Without replacement

Split the folds into test and training sets:

  • 1 fold forms the test set
  • The remaining K-1 folds form the training set

  • Fit the model to the training data
  • Apply the fitted model to the test set
  • Measure the prediction error

Repeat K Times

  • Fit the model to each of the K combinations of K-1 training folds
  • Use each fold as the test set exactly once

Calculate the mean error
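The steps above can be sketched as a small helper. This is an illustrative implementation, not code from the study; the function and argument names are made up for this example:

```python
import numpy as np

def k_fold_cv(X, y, fit, predict, error, k=5, seed=0):
    """K-fold CV sketch: shuffle, split into k folds, hold each fold out once."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))            # randomize, without replacement
    folds = np.array_split(idx, k)           # k approximately equal-sized folds
    errors = []
    for i in range(k):
        test = folds[i]                      # 1 fold is the test set
        train = np.concatenate([folds[j] for j in range(k) if j != i])  # K-1 folds train
        model = fit(X[train], y[train])      # fit the model to the training data
        yhat = predict(model, X[test])       # apply the fitted model to the test set
        errors.append(error(y[test], yhat))  # measure the prediction error
    return np.mean(errors)                   # mean error over the k repeats
```

Any model can be plugged in through the `fit`/`predict` callables, and any measure of error through `error`.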

Bias-Variance Trade-Off

K-Fold vs. LOOCV

 

Method   Computation   Bias           Variance
K-Fold   Lower         Intermediate   Lower
LOOCV    Highest       Unbiased       High

K-fold where K = 5 or K = 10 is recommended:

  • Lower computational cost
  • Does not show excessive bias
  • Does not show excessive variance

 

Model Measures of Error (MOE)

  • Measure the quality of fit of a model
  • Measuring error is a critical data modeling step
  • Different MOE for different data types

By measuring the quality of fit, we can select the model that generalizes best.

\[ \text{MAE} = \frac{1}{n} \sum_{i=1}^n |y_i - \hat{f}(x_i)| \tag{1} \]

  • A measure of error magnitude
  • The sign does not matter - the absolute value is used
  • A lower magnitude indicates a better fit
  • Take the mean absolute difference between:
    • observed \((y_i)\) and predicted \(\hat{f}(x_i)\) values
  • \(n\) is the number of observations
  • \(\hat{f}(x_i)\) is the prediction of model \(\hat{f}\) for the ith observation
  • \(y_i\) is the observed value

\[ \text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i-\hat{f}(x_i))^2} \tag{2} \]

  • A measure of error magnitude
  • A lower magnitude indicates a better fit
  • Error is weighted
    • Squaring the errors gives more weight to the larger ones
    • Taking the square root returns the error to the same units as the response variable

\[ \text{R}^2 = \frac{SS_{tot}-SS_{res}}{SS_{tot}} = 1 - \frac{SS_{res}}{SS_{tot}} = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{f}(x_i))^2}{\sum_{i=1}^{n}(y_i-\bar{y})^2} \tag{3} \]

  • Proportion of the variance explained by the predictor(s)
  • A higher value indicates a better fit
    • An \(R^2\) value of 0.75 indicates 75% of the variance in the response variable is explained by the predictor(s)
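Equations (1)-(3) translate directly into code. A minimal sketch of the three measures of error (the helper names are ours):

```python
import numpy as np

def mae(y, yhat):
    """Equation (1): mean absolute difference between observed and predicted."""
    return np.mean(np.abs(y - yhat))

def rmse(y, yhat):
    """Equation (2): squaring weights larger errors; the square root restores
    the original units of the response variable."""
    return np.sqrt(np.mean((y - yhat) ** 2))

def r2(y, yhat):
    """Equation (3): proportion of variance explained, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot
```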

K-Fold Cross-Validation

\[ CV_{(k)} = \frac{1}{k}\sum_{i=1}^{k} \text{MOE}_i \tag{4} \]

where \(\text{MOE}_i\) is the chosen measure of error computed on the \(i\)th fold.

Leave-One-Out Cross-Validation (LOOCV)

\[ CV_{(n)} = \frac{1}{n}\sum_{i=1}^{n} \text{MOE}_i \tag{5} \]

where \(\text{MOE}_i\) is the measure of error on the \(i\)th held-out observation.
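LOOCV is just K-fold with k = n. A minimal sketch using MAE as the measure of error (illustrative code, not from the study):

```python
import numpy as np

def loocv_mae(X, y, fit, predict):
    """Leave-one-out CV: each observation is the test set exactly once."""
    n = len(y)
    errs = []
    for i in range(n):
        train = np.delete(np.arange(n), i)         # all observations except i
        model = fit(X[train], y[train])
        errs.append(abs(y[i] - predict(model, X[i:i + 1])[0]))
    return np.mean(errs)                           # CV_(n): mean of n errors
```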

Nested Cross-Validation
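Nested CV uses two loops: an inner CV loop, run on the training folds only, selects hyperparameters, and an outer loop estimates the generalization error of that whole selection procedure. A minimal sketch (the function name, argument names, and fold counts are illustrative choices, not from the study):

```python
import numpy as np

def nested_cv(X, y, fit, predict, error, params, outer_k=5, inner_k=3, seed=0):
    """Nested CV sketch: inner loop tunes a hyperparameter, outer loop scores it."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    outer = np.array_split(idx, outer_k)
    scores = []
    for i in range(outer_k):
        test = outer[i]
        train = np.concatenate([outer[j] for j in range(outer_k) if j != i])

        def inner_err(p):
            # inner CV on the training folds only: no test-set leakage
            folds = np.array_split(train, inner_k)
            errs = []
            for m in range(inner_k):
                val = folds[m]
                tr = np.concatenate([folds[q] for q in range(inner_k) if q != m])
                errs.append(error(y[val], predict(fit(X[tr], y[tr], p), X[val])))
            return np.mean(errs)

        best = min(params, key=inner_err)          # hyperparameter with lowest inner error
        model = fit(X[train], y[train], best)      # refit on all training folds
        scores.append(error(y[test], predict(model, X[test])))
    return np.mean(scores)                         # outer estimate of generalization error
```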

Study Data

(I-C Yeh 1998) modeled the compressive strength of high-performance concrete (HPC) at various ages and with different ratios of components. The data used for the study was made publicly available and can be downloaded from the UCI Machine Learning Repository (I-Cheng Yeh 2007).

Data Exploration and Visualization

  • Target variable:
    • Strength in MPa
  • Predictor variables:
    • Cement in kg per m³ of mixture
    • Superplasticizer in kg per m³ of mixture
    • Age in days
    • Water in kg per m³ of mixture

All variables are quantitative
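The variable layout can be illustrated with a couple of synthetic placeholder rows (the numbers below are invented for illustration, not rows from the study data):

```python
import numpy as np

# Columns: Cement, Superplasticizer, Age, Water, Strength (illustrative values)
data = np.array([
    [300.0,  5.0,  28.0, 180.0, 35.0],
    [400.0, 10.0,  90.0, 160.0, 55.0],
])
X = data[:, :4]   # quantitative predictors (kg per m^3, days)
y = data[:, 4]    # target: compressive strength in MPa
print("predictor means:", X.mean(axis=0))
```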

Linear Regression Model

                  Estimate     Std. Error  t value    Pr(>|t|)
(Intercept)       28.2578655   5.1878634    5.446918  1.0e-07
Cement             0.0668433   0.0039668   16.850539  0.0e+00
Superplasticizer   0.8716897   0.0903825    9.644449  0.0e+00
Age                0.1110466   0.0069538   15.969235  0.0e+00
Water             -0.1195600   0.0257210   -4.648334  3.9e-06

\[ \widehat{\text{Strength}} = 28.258 + 0.067\,\text{Cement} + 0.872\,\text{Superplasticizer} + 0.111\,\text{Age} - 0.120\,\text{Water} \]
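The fitted equation can be applied directly. A small sketch using the coefficients from the table above rounded to three decimals (the Water coefficient -0.1196 rounds to -0.120; the helper name is ours):

```python
def predict_strength(cement, superplasticizer, age, water):
    """Predicted compressive strength (MPa) from the fitted linear regression,
    using coefficients rounded to three decimals."""
    return (28.258
            + 0.067 * cement
            + 0.872 * superplasticizer
            + 0.111 * age
            - 0.120 * water)
```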

Linear Regression CV Results

  • K-Fold Results:
Measure of Error Result
RMSE 12.13
MAE 9.23
R2 0.46

  • LOOCV Results:
Measure of Error Result
RMSE 12.13
MAE 9.23
R2 0.46

  • Nested CV Results:
Measure of Error Result
RMSE 11.87
MAE 9.43
R2 0.49

LightGBM Model

 


  • Ensemble of decision trees
  • Uses gradient boosting
  • Final prediction is the sum of predictions from all individual trees
  • Provides feature importance scores
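The bullets above can be made concrete with a toy gradient-boosting sketch built from one-split regression stumps. This is not LightGBM itself (LightGBM adds leaf-wise tree growth, histogram binning, and much more); it only illustrates the core idea that each tree fits the residuals of the ensemble so far, and the final prediction is the sum of all trees:

```python
import numpy as np

def fit_stump(X, r):
    """Best single-split regression stump on feature 0 (minimal sketch)."""
    best = None
    for t in np.unique(X[:, 0]):
        left, right = r[X[:, 0] <= t], r[X[:, 0] > t]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    _, t, lv, rv = best
    return lambda Xn: np.where(Xn[:, 0] <= t, lv, rv)

def boost(X, y, n_trees=50, lr=0.1):
    """Gradient boosting for squared error: each stump fits the current
    residuals (the negative gradient); predictions are summed over all stumps."""
    pred = np.full(len(y), y.mean())
    trees = []
    for _ in range(n_trees):
        stump = fit_stump(X, y - pred)   # fit the residuals of the ensemble so far
        pred = pred + lr * stump(X)
        trees.append(stump)
    return lambda Xnew: y.mean() + lr * sum(t(Xnew) for t in trees)
```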

LightGBM CV Results

  • K-Fold Results:
Measure of Error Result
RMSE 8.73
MAE 6.82
R2 0.73

  • LOOCV Results:
Measure of Error Result
RMSE 5.93
MAE 4.32
R2 0.87

  • Nested CV Results:
Measure of Error Result
RMSE 8.27
MAE 6.39
R2 0.75

Comparison of Models

  • Performance Comparison: Linear Regression vs. LightGBM
  • Advantages and disadvantages of each model
Method Measure of Error Linear Regression LightGBM
5-Fold RMSE 12.13 8.73
5-Fold MAE 9.23 6.82
5-Fold R2 0.46 0.73
LOOCV RMSE 12.13 5.93
LOOCV MAE 9.23 4.32
LOOCV R2 0.46 0.87
NCV RMSE 11.87 8.27
NCV MAE 9.43 6.39
NCV R2 0.49 0.75

Model Comparison K-Fold Plot

Model Comparison LOOCV Plot

Model Comparison Nested CV Plot

LightGBM (Light Gradient Boosting Machine)

  • Description: A gradient boosting framework that uses tree-based learning algorithms.
  • Pros: High efficiency, fast training, and capable of handling large datasets.
  • Cons: Requires careful tuning of parameters.

Predictive Performance Comparison

  • LightGBM outperformed the linear regression model across all cross-validation techniques.
  • Lower prediction errors and more reliable performance metrics.
  • Demonstrated strong generalization capabilities.

Computational Efficiency

  • LightGBM: Fast training and efficient computation.
  • Nested Cross-Validation: Excellent performance but computationally intensive.
  • Efficiency crucial for real-world applications with limited resources.

Conclusion

  • Cross-validation techniques and LightGBM effectively reduce overfitting and enhance model accuracy.
  • LightGBM offers superior accuracy and efficiency.
  • Identified key predictors for accurate model development.
  • Robust framework for model evaluation, improving decision-making in concrete design and construction.

Future Research

  • Further refinement of these techniques to improve predictive accuracy.
  • Exploration of additional advanced models.
  • Application in various engineering contexts to enhance model reliability and performance.

References

All figures were created by the authors.

Yeh, I-C. 1998. “Modeling of Strength of High-Performance Concrete Using Artificial Neural Networks.” Cement and Concrete Research 28 (12): 1797–1808.
Yeh, I-Cheng. 2007. “Concrete Compressive Strength.” UCI Machine Learning Repository.